A general framework for multicharacter segmentation and its application in recognizing multilingual Asian documents

Authors

  • Di Wen
  • Xiaoqing Ding
Abstract

In this paper we propose a general framework for character segmentation in complex multilingual documents, an attempt to combine the traditionally separate segmentation and recognition processes into a cooperative system. The framework contains three basic steps: Dissection, Local Optimization and Global Optimization, which are designed to fuse various properties of the segmentation hypotheses hierarchically into a composite evaluation that decides the final recognition results. Experimental results show that this framework is general enough to be applied to a variety of documents. Finally, we describe a sample system based on this framework that recognizes Chinese, Japanese and Korean documents, and we report its experimental performance.
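The abstract names the three steps but does not say how each one is computed, so the following is only a minimal sketch of how such a pipeline could be wired together, not the authors' method. Everything in it is an assumption made for illustration: the function names `dissect`, `local_score` and `segment`, the caller-supplied `recognize` callback, the `expected_width` and `weight` parameters, the gap-based dissection of an ink projection profile, and the width-weighted dynamic program used for the global step.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical recognizer interface: (start, end) columns -> (label, confidence in [0, 1]).
Recognizer = Callable[[int, int], Tuple[str, float]]


def dissect(profile: List[int], max_ink: int = 0) -> List[int]:
    """Dissection (assumed form): propose a cut wherever a gap in the ink profile begins."""
    cuts = [0]
    for x in range(1, len(profile)):
        if profile[x] <= max_ink < profile[x - 1]:
            cuts.append(x)
    cuts.append(len(profile))
    return cuts


def local_score(start: int, end: int, recognize: Recognizer,
                expected_width: float, weight: float = 0.7) -> Tuple[str, float]:
    """Local step (assumed form): fuse recognizer confidence with a width prior."""
    label, confidence = recognize(start, end)
    shape = math.exp(-abs((end - start) - expected_width) / expected_width)
    return label, weight * confidence + (1 - weight) * shape


def segment(cuts: List[int], recognize: Recognizer, expected_width: float,
            max_merge: int = 3) -> List[Tuple[int, int, str]]:
    """Global step (assumed form): pick the cut path with the best width-weighted score."""
    n = len(cuts)
    best = [float("-inf")] * n
    back: List[Tuple[int, str]] = [(-1, "")] * n
    best[0] = 0.0
    for j in range(1, n):
        # A hypothesis may merge up to max_merge adjacent dissected pieces.
        for i in range(max(0, j - max_merge), j):
            label, score = local_score(cuts[i], cuts[j], recognize, expected_width)
            total = best[i] + score * (cuts[j] - cuts[i])
            if total > best[j]:
                best[j] = total
                back[j] = (i, label)
    # Walk the back-pointers from the last cut to recover the chosen segments.
    result = []
    j = n - 1
    while j > 0:
        i, label = back[j]
        result.append((cuts[i], cuts[j], label))
        j = i
    return result[::-1]


# Toy input: a two-part character (internal gap at column 2) followed by a solid one.
profile = [3, 3, 0, 3, 3, 0, 4, 4, 4, 4]
known = {(0, 5): ("好", 0.9), (5, 10): ("人", 0.9)}  # stand-in for a real classifier


def toy_recognize(start: int, end: int) -> Tuple[str, float]:
    return known.get((start, end), ("?", 0.1))


cuts = dissect(profile)
segments = segment(cuts, toy_recognize, expected_width=5)
```

In this toy run the dissection over-segments the first character at its internal gap, and the global step merges the two pieces again because the merged hypothesis scores higher. Weighting each score by segment width keeps paths with different numbers of segments comparable; other normalizations would work as well.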


Related resources

Title of dissertation : ADAPTIVE ANALYSIS AND PROCESSING OF STRUCTURED MULTILINGUAL

Title of dissertation: ADAPTIVE ANALYSIS AND PROCESSING OF STRUCTURED MULTILINGUAL DOCUMENTS Huanfeng Ma, Doctor of Philosophy, 2006 Dissertation directed by: Professor Rama Chellappa Dr. David S. Doermann Electrical and Computer Engineering Department Digital document processing is becoming popular for applications to office and library automation, bank and postal services, publishing houses a...


Quantitative Comparison of SPM, FSL, and Brainsuite for Brain MR Image Segmentation

Background: Accurate brain tissue segmentation from magnetic resonance (MR) images is an important step in analysis of cerebral images. There are software packages which are used for brain segmentation. These packages usually contain a set of skull stripping, intensity non-uniformity (bias) correction and segmentation routines. Thus, assessment of the quality of the segmented gray matter (GM), ...


Hybrid Language Segmentation for Historical Documents

English. Language segmentation, i.e. the division of a multilingual text into monolingual fragments, has been addressed in the past, but its application to historical documents has been largely unexplored. We propose a method for language segmentation for multilingual historical documents. For documents that contain a mix of high- and low-resource languages, we leverage the high availability of hi...


Explaining the agency's mind in recognizing geographic space within the framework of existentialism, with emphasis on Jean-Paul Sartre's Viewpoint

In philosophy of science, particularly in the human sciences, concepts have multiple meanings and even contradictory definitions, from ontological and epistemological perspectives, in different philosophical schools. Therefore, determining the theoretical framework for understanding fundamental concepts is most important in the study and understanding of concepts in the human sciences. One of t...


An efficient method for cloud detection based on the feature-level fusion of Landsat-8 OLI spectral bands in deep convolutional neural network

Cloud segmentation is a critical pre-processing step for any multi-spectral satellite image application. In particular, disaster-related applications e.g., flood monitoring or rapid damage mapping, which are highly time and data-critical, require methods that produce accurate cloud masks in a short time while being able to adapt to large variations in the target domain (induced by atmospheric c...




Publication date: 2004